Learning a Ground Truth Ranking Using Noisy Approval Votes
Abstract
We consider a voting scenario where agents have opinions that are estimates of an underlying common ground truth ranking of the available alternatives, and each agent is asked to approve a set of her most preferred alternatives. We assume that estimates are implicitly formed using the well-known Mallows model for generating random rankings. We show that k-approval voting, where all agents are asked to approve the same number k of alternatives and the outcome is obtained by sorting the alternatives by their number of approvals, has exponential sample complexity for all values of k. This negative result suggests that a number of agents exponential in the number of alternatives m is always necessary in order to recover the ground truth ranking with high probability. In contrast, by just asking each agent to approve a random number of alternatives, the sample complexity improves dramatically: it now depends only polynomially on m. Our results may have implications for the effectiveness of crowdsourcing applications that ask workers to provide their input by approving sets of available alternatives.
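The setup described in the abstract can be simulated in a few lines. The following is a minimal sketch, not the authors' construction. It assumes the dispersion parameterization of the Mallows model (a parameter `phi` in [0, 1], with smaller values meaning less noise), draws rankings with the standard repeated insertion method, and breaks ties between equally approved alternatives alphabetically. The function names `sample_mallows` and `approval_ranking`, and the uniform choice of the random approval count in the usage note, are illustrative assumptions rather than details taken from the paper.

```python
import random
from collections import Counter


def sample_mallows(truth, phi, rng):
    """Draw one noisy ranking around `truth` by repeated insertion.

    Assumption: `phi` is the Mallows dispersion parameter in [0, 1];
    phi = 0 always reproduces `truth`, phi = 1 gives a uniform ranking.
    """
    ranking = []
    for i, alt in enumerate(truth):
        # Slot i (the end) preserves the reference order; each step
        # toward the front multiplies the weight by phi.
        weights = [phi ** (i - j) for j in range(i + 1)]
        slot = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(slot, alt)
    return ranking


def approval_ranking(truth, n_agents, phi, pick_k, rng):
    """Sort alternatives by number of approvals, most approved first.

    `pick_k(rng)` returns how many alternatives one agent approves:
    a constant for k-approval, or a random value for randomized approval.
    """
    counts = Counter({alt: 0 for alt in truth})
    for _ in range(n_agents):
        vote = sample_mallows(truth, phi, rng)
        counts.update(vote[:pick_k(rng)])
    # Ties are broken alphabetically, an arbitrary choice for this sketch.
    return sorted(truth, key=lambda alt: (-counts[alt], alt))
```

For example, `approval_ranking(list("abcde"), 1000, 0.5, lambda r: 2, random.Random(0))` runs 2-approval, while passing `lambda r: r.randint(1, 4)` lets each agent approve a uniformly random number of alternatives. Comparing how often each variant recovers the reference order as the number of agents grows gives an informal feel for the sample-complexity gap the abstract describes.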
Similar resources
Electing the Most Probable Without Eliminating the Irrational: Voting Over Intransitive Domains
Picking the best alternative in a given set is a well-studied problem at the core of social choice theory. In some applications, one can assume that there is an objectively correct way to compare the alternatives, which, however, cannot be observed directly, and individuals’ preferences over the alternatives (votes) are noisy estimates of this ground truth. The goal of voting in this case is to...
Assessing Quality of Product Reviews
In the past few years, there has been an increasing interest in mining opinions from product reviews [3][4][5]. However, due to the lack of editorial and quality control, reviews on products vary greatly in quality. Thus, it is crucial to have a mechanism capable of assessing the quality of reviews and detecting low-quality and noisy reviews. Some shopping sites already provide a function of as...
Sync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and Semidefinite Programming Synchronization
Abstract. We consider the classic problem of establishing a statistical ranking of a set of n items given a set of inconsistent and incomplete pairwise comparisons between such items. Instantiations of this problem occur in numerous applications in data analysis (e.g., ranking teams in sports data), computer vision, and machine learning. We formulate the above problem of ranking with incomplete...
When Do Noisy Votes Reveal the Truth?
A well-studied approach to the design of voting rules views them as maximum likelihood estimators; given votes that are seen as noisy estimates of a true ranking of the alternatives, the rule must reconstruct the most likely true ranking. We argue that this is too stringent a requirement, and instead ask: How many votes does a voting rule need to reconstruct the true ranking? We define the fami...
Using community structure detection to rank annotators when ground truth is subjective
Learning using labels provided by multiple annotators has attracted a lot of interest in the machine learning community. With the advent of crowdsourcing, cheap, noisy labels are easy to obtain. This has raised the question of how to assess annotator quality. Prior work uses Bayesian inference to estimate consensus labels and obtain annotator scores based on expertise; the key assumptions are th...